Overview

Dataset statistics

Number of variables: 10
Number of observations: 101
Missing cells: 18
Missing cells (%): 1.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 42.2 KiB
Average record size in memory: 427.7 B
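
The missing-cell figures can be reproduced from counts that appear elsewhere in this report. The sketch below assumes the percentage is missing cells divided by all cells (variables × observations) and takes the per-column counts from the Alerts section (email 7, phone 6, age 5). The variable names are illustrative only.

```python
# Counts copied from the report; the formula itself is an assumption.
n_variables = 10
n_observations = 101
missing_per_column = {"email": 7, "phone": 6, "age": 5}

missing_cells = sum(missing_per_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 18
print(f"{missing_pct:.1f}%")  # 1.8%
```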

Variable types

Text: 3
Categorical: 3
Numeric: 2
DateTime: 1
Boolean: 1

Alerts

department is highly overall correlated with first_name and 2 other fields [High correlation]
first_name is highly overall correlated with department and 2 other fields [High correlation]
is_active is highly overall correlated with department and 3 other fields [High correlation]
last_name is highly overall correlated with department and 2 other fields [High correlation]
salary is highly overall correlated with is_active [High correlation]
is_active is highly imbalanced (50.3%) [Imbalance]
email has 7 (6.9%) missing values [Missing]
phone has 6 (5.9%) missing values [Missing]
age has 5 (5.0%) missing values [Missing]
department is uniformly distributed [Uniform]
customer_id has unique values [Unique]
hire_date has unique values [Unique]
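
The 50.3% in the imbalance alert is not the share of the majority class (is_active is 90 True, 11 False). One reading that reproduces the figure is one minus the Shannon entropy of the class frequencies, normalized by the maximum entropy for that number of classes. This is an assumption about how the score is derived, not something the report states; the function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class frequencies."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts for is_active, copied from the report: 90 True, 11 False.
score = imbalance_score([90, 11])
print(f"{100 * score:.1f}%")  # 50.3%
```

Under this definition a perfectly balanced variable scores 0% and a single-valued one approaches 100%.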

Reproduction

Analysis started: 2025-08-28 07:44:50.628425
Analysis finished: 2025-08-28 07:44:53.974615
Duration: 3.35 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json

Variables

customer_id
Text

Unique 

Distinct: 101
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1010
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 101
Unique (%): 100.0%

Sample

1st row: CUST_01000
2nd row: CUST_01001
3rd row: CUST_01002
4th row: CUST_01003
5th row: CUST_01004

Value              Count  Frequency (%)
cust_01012         1      1.0%
cust_01100         1      1.0%
cust_01000         1      1.0%
cust_01001         1      1.0%
cust_01002         1      1.0%
cust_01093         1      1.0%
cust_01094         1      1.0%
cust_01095         1      1.0%
cust_01096         1      1.0%
cust_01097         1      1.0%
Other values (91)  91     90.1%

Most occurring characters

Value             Count  Frequency (%)
0                 223    22.1%
1                 122    12.1%
C                 101    10.0%
S                 101    10.0%
U                 101    10.0%
_                 101    10.0%
T                 101    10.0%
2                 20     2.0%
3                 20     2.0%
8                 20     2.0%
Other values (5)  100    9.9%
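
The character counts above follow directly from the ID pattern. The sketch below assumes the 101 IDs are consecutive, from CUST_01000 to CUST_01100, which is consistent with the sample rows but is not stated outright in the report.

```python
from collections import Counter

# Assumption: IDs run consecutively from CUST_01000 to CUST_01100 (101 values).
ids = [f"CUST_{n:05d}" for n in range(1000, 1101)]

char_counts = Counter("".join(ids))
total_chars = sum(char_counts.values())

print(total_chars, len(char_counts))  # 1010 15
for char in "012":
    count = char_counts[char]
    print(char, count, f"{100 * count / total_chars:.1f}%")
# 0 223 22.1%
# 1 122 12.1%
# 2 20 2.0%
```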

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1010   100.0%

Most frequent character per category

(unknown): identical to the Most occurring characters table above.

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1010   100.0%

Most frequent character per script

(unknown): identical to the Most occurring characters table above.

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1010   100.0%

Most frequent character per block

(unknown): identical to the Most occurring characters table above.

first_name
Categorical

High correlation 

Distinct: 11
Distinct (%): 10.9%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Value             Count
John              10
Jane              10
Bob               10
Alice             10
Charlie           10
Other values (6)  51

Length

Max length: 7
Median length: 5
Mean length: 4.5940594
Min length: 3

Characters and Unicode

Total characters: 464
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: John
2nd row: Jane
3rd row: Bob
4th row: Alice
5th row: Charlie

Common Values

Value    Count  Frequency (%)
John     10     9.9%
Jane     10     9.9%
Bob      10     9.9%
Alice    10     9.9%
Charlie  10     9.9%
Diana    10     9.9%
Eve      10     9.9%
Frank    10     9.9%
Grace    10     9.9%
Henry    10     9.9%

Length

Histogram of lengths of the category

Value    Count  Frequency (%)
john     10     9.9%
jane     10     9.9%
bob      10     9.9%
alice    10     9.9%
charlie  10     9.9%
diana    10     9.9%
eve      10     9.9%
frank    10     9.9%
grace    10     9.9%
henry    10     9.9%

Most occurring characters

Value              Count  Frequency (%)
a                  61     13.1%
e                  60     12.9%
n                  50     10.8%
r                  40     8.6%
i                  30     6.5%
h                  20     4.3%
o                  20     4.3%
c                  20     4.3%
J                  20     4.3%
l                  20     4.3%
Other values (13)  123    26.5%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  464    100.0%

Most frequent character per category

(unknown): identical to the Most occurring characters table above.

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  464    100.0%

Most frequent character per script

(unknown): identical to the Most occurring characters table above.

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  464    100.0%

Most frequent character per block

(unknown): identical to the Most occurring characters table above.

last_name
Categorical

High correlation 

Distinct: 11
Distinct (%): 10.9%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Value             Count
Smith             10
Johnson           10
Williams          10
Brown             10
Jones             10
Other values (6)  51

Length

Max length: 9
Median length: 8
Mean length: 6.3960396
Min length: 5

Characters and Unicode

Total characters: 646
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: Smith
2nd row: Johnson
3rd row: Williams
4th row: Brown
5th row: Jones

Common Values

Value      Count  Frequency (%)
Smith      10     9.9%
Johnson    10     9.9%
Williams   10     9.9%
Brown      10     9.9%
Jones      10     9.9%
Garcia     10     9.9%
Miller     10     9.9%
Davis      10     9.9%
Rodriguez  10     9.9%
Martinez   10     9.9%

Length

Histogram of lengths of the category

Value      Count  Frequency (%)
smith      10     9.9%
johnson    10     9.9%
williams   10     9.9%
brown      10     9.9%
jones      10     9.9%
garcia     10     9.9%
miller     10     9.9%
davis      10     9.9%
rodriguez  10     9.9%
martinez   10     9.9%

Most occurring characters

Value              Count  Frequency (%)
i                  81     12.5%
o                  51     7.9%
n                  51     7.9%
a                  50     7.7%
r                  50     7.7%
s                  41     6.3%
l                  41     6.3%
e                  40     6.2%
J                  20     3.1%
m                  20     3.1%
Other values (16)  201    31.1%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  646    100.0%

Most frequent character per category

(unknown): identical to the Most occurring characters table above.

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  646    100.0%

Most frequent character per script

(unknown): identical to the Most occurring characters table above.

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  646    100.0%

Most frequent character per block

(unknown): identical to the Most occurring characters table above.

email
Text

Missing 

Distinct: 94
Distinct (%): 100.0%
Missing: 7
Missing (%): 6.9%
Memory size: 6.5 KiB

Length

Max length: 19
Median length: 18
Mean length: 17.914894
Min length: 17

Characters and Unicode

Total characters: 1684
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 94
Unique (%): 100.0%

Sample

1st row: user1@example.com
2nd row: user2@example.com
3rd row: user3@example.com
4th row: user4@example.com
5th row: user5@example.com

Value               Count  Frequency (%)
user37@example.com  1      1.1%
user73@example.com  1      1.1%
user72@example.com  1      1.1%
user71@example.com  1      1.1%
user70@example.com  1      1.1%
user69@example.com  1      1.1%
user40@example.com  1      1.1%
user39@example.com  1      1.1%
user38@example.com  1      1.1%
user24@example.com  1      1.1%
Other values (84)   84     89.4%

Most occurring characters

Value              Count  Frequency (%)
e                  282    16.7%
m                  188    11.2%
u                  94     5.6%
r                  94     5.6%
@                  94     5.6%
x                  94     5.6%
s                  94     5.6%
l                  94     5.6%
.                  94     5.6%
a                  94     5.6%
Other values (13)  462    27.4%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1684   100.0%

Most frequent character per category

(unknown): identical to the Most occurring characters table above.

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1684   100.0%

Most frequent character per script

(unknown): identical to the Most occurring characters table above.

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1684   100.0%

Most frequent character per block

(unknown): identical to the Most occurring characters table above.

phone
Text

Missing 

Distinct: 95
Distinct (%): 100.0%
Missing: 6
Missing (%): 5.9%
Memory size: 6.3 KiB

Length

Max length: 15
Median length: 15
Mean length: 15
Min length: 15

Characters and Unicode

Total characters: 1425
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 95
Unique (%): 100.0%

Sample

1st row: +1-555-202-1860
2nd row: +1-555-370-6191
3rd row: +1-555-800-6734
4th row: +1-555-221-1466
5th row: +1-555-314-5426

Value              Count  Frequency (%)
1-555-921-7873     1      1.1%
1-555-653-8035     1      1.1%
1-555-492-1206     1      1.1%
1-555-497-7938     1      1.1%
1-555-602-7910     1      1.1%
1-555-104-8385     1      1.1%
1-555-561-4840     1      1.1%
1-555-369-8629     1      1.1%
1-555-301-1995     1      1.1%
1-555-655-1161     1      1.1%
Other values (85)  85     89.5%

Most occurring characters

Value             Count  Frequency (%)
5                 348    24.4%
-                 285    20.0%
1                 161    11.3%
+                 95     6.7%
6                 83     5.8%
8                 76     5.3%
4                 72     5.1%
3                 68     4.8%
9                 68     4.8%
2                 62     4.4%
Other values (2)  107    7.5%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1425   100.0%

Most frequent character per category

(unknown): identical to the Most occurring characters table above.

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1425   100.0%

Most frequent character per script

(unknown): identical to the Most occurring characters table above.

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1425   100.0%

Most frequent character per block

(unknown): identical to the Most occurring characters table above.

age
Real number (ℝ)

Missing 

Distinct: 49
Distinct (%): 51.0%
Missing: 5
Missing (%): 5.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.09375
Minimum: 18
Maximum: 79
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 940.0 B

Quantile statistics

Minimum: 18
5-th percentile: 19
Q1: 33.75
Median: 49
Q3: 66.25
95-th percentile: 77
Maximum: 79
Range: 61
Interquartile range (IQR): 32.5

Descriptive statistics

Standard deviation: 19.273149
Coefficient of variation (CV): 0.39257847
Kurtosis: -1.2700166
Mean: 49.09375
Median Absolute Deviation (MAD): 17
Skewness: -0.063984086
Sum: 4713
Variance: 371.45428
Monotonicity: Not monotonic
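
Several of the statistics above are derived from one another, and the relationships can be checked from the reported figures alone. The sketch below copies the numbers from this section and assumes the mean is taken over the non-missing values only (101 observations minus 5 missing). Variable names are illustrative.

```python
# Figures copied from the report for `age`.
total = 4713                 # Sum
n_present = 101 - 5          # observations minus missing values (assumed denominator)
std = 19.273149              # Standard deviation
q1, q3 = 33.75, 66.25
minimum, maximum = 18, 79

mean = total / n_present             # 49.09375
cv = std / mean                      # coefficient of variation, about 0.3926
variance = std ** 2                  # about 371.45
iqr = q3 - q1                        # 32.5
value_range = maximum - minimum      # 61

print(mean, round(cv, 4), round(variance, 2), iqr, value_range)
```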
Histogram with fixed size bins (bins=49)

Value              Count  Frequency (%)
66                 5      5.0%
69                 4      4.0%
19                 4      4.0%
49                 4      4.0%
76                 4      4.0%
77                 4      4.0%
20                 3      3.0%
36                 3      3.0%
50                 3      3.0%
43                 3      3.0%
Other values (39)  59     58.4%
(Missing)          5      5.0%

Value  Count  Frequency (%)
18     2      2.0%
19     4      4.0%
20     3      3.0%
21     1      1.0%
22     1      1.0%
23     3      3.0%
24     2      2.0%
26     1      1.0%
28     3      3.0%
29     1      1.0%

Value  Count  Frequency (%)
79     2      2.0%
77     4      4.0%
76     4      4.0%
75     1      1.0%
74     2      2.0%
73     2      2.0%
72     2      2.0%
71     1      1.0%
70     1      1.0%
69     4      4.0%

salary
Real number (ℝ)

High correlation 

Distinct: 98
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 82866.842
Minimum: -999
Maximum: 149181
Zeros: 0
Zeros (%): 0.0%
Negative: 4
Negative (%): 4.0%
Memory size: 940.0 B

Quantile statistics

Minimum: -999
5-th percentile: 32200
Q1: 51357
Median: 78404
Q3: 112989
95-th percentile: 146336
Maximum: 149181
Range: 150180
Interquartile range (IQR): 61632

Descriptive statistics

Standard deviation: 39147.261
Coefficient of variation (CV): 0.47241165
Kurtosis: -0.74935224
Mean: 82866.842
Median Absolute Deviation (MAD): 29101
Skewness: -0.044370351
Sum: 8369551
Variance: 1.5325081 × 10^9
Monotonicity: Not monotonic
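
The four negative values are all -999, which looks like a placeholder rather than a real salary. That reading is an interpretation, not something the report states. Under that assumption, the sketch below shows how much the placeholder pulls the mean down, using only the sum and counts reported in this section.

```python
# Figures copied from the report for `salary`.
total, n_rows = 8369551, 101
sentinel, n_sentinel = -999, 4   # assumption: -999 marks "unknown", not a real salary

mean_all = total / n_rows

# Remove the sentinel rows from both the sum and the count.
clean_total = total - sentinel * n_sentinel
mean_clean = clean_total / (n_rows - n_sentinel)

print(f"{mean_all:.2f}")    # 82866.84
print(f"{mean_clean:.2f}")  # 86325.23
```

If the placeholder reading is right, the minimum, range, and standard deviation reported for salary are similarly affected.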
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
-999               4      4.0%
106213             1      1.0%
104290             1      1.0%
33436              1      1.0%
77333              1      1.0%
109909             1      1.0%
146381             1      1.0%
36893              1      1.0%
47429              1      1.0%
49830              1      1.0%
Other values (88)  88     87.1%

Value  Count  Frequency (%)
-999   4      4.0%
32049  1      1.0%
32200  1      1.0%
32557  1      1.0%
32869  1      1.0%
33436  1      1.0%
33987  1      1.0%
34499  1      1.0%
35539  1      1.0%
35600  1      1.0%

Value   Count  Frequency (%)
149181  1      1.0%
149121  1      1.0%
147858  1      1.0%
147796  1      1.0%
146381  1      1.0%
146336  1      1.0%
142893  1      1.0%
142476  1      1.0%
142296  1      1.0%
139616  1      1.0%

department
Categorical

High correlation  Uniform 

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Value        Count
Sales        21
Marketing    20
Engineering  20
HR           20
Finance      20

Length

Max length: 11
Median length: 7
Mean length: 6.7821782
Min length: 2

Characters and Unicode

Total characters: 685
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Sales
2nd row: Marketing
3rd row: Engineering
4th row: HR
5th row: Finance

Common Values

Value        Count  Frequency (%)
Sales        21     20.8%
Marketing    20     19.8%
Engineering  20     19.8%
HR           20     19.8%
Finance      20     19.8%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value        Count  Frequency (%)
sales        21     20.8%
marketing    20     19.8%
engineering  20     19.8%
hr           20     19.8%
finance      20     19.8%

Most occurring characters

Value             Count  Frequency (%)
n                 120    17.5%
e                 101    14.7%
i                 80     11.7%
a                 61     8.9%
g                 60     8.8%
r                 40     5.8%
S                 21     3.1%
s                 21     3.1%
l                 21     3.1%
t                 20     2.9%
Other values (7)  140    20.4%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  685    100.0%

Most frequent character per category

(unknown): identical to the Most occurring characters table above.

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  685    100.0%

Most frequent character per script

(unknown): identical to the Most occurring characters table above.

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  685    100.0%

Most frequent character per block

(unknown): identical to the Most occurring characters table above.

hire_date
Date

Unique 

Distinct: 101
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 940.0 B
Minimum: 2016-03-04 00:00:00
Maximum: 2025-08-24 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%

Histogram with fixed size bins (bins=50)

is_active
Boolean

High correlation  Imbalance 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 233.0 B

Value  Count  Frequency (%)
True   90     89.1%
False  11     10.9%

Interactions

[Figures: interaction plots]

Correlations

            age    department  first_name  is_active  last_name  salary
age         1.000  0.000       0.119       0.000      0.119      0.092
department  0.000  1.000       0.968       0.656      0.968      0.161
first_name  0.119  0.968       1.000       0.953      1.000      0.227
is_active   0.000  0.656       0.953       1.000      0.953      0.552
last_name   0.119  0.968       1.000       0.953      1.000      0.227
salary      0.092  0.161       0.227       0.552      0.227      1.000
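
The report does not say which association measure produced each cell. For pairs of categorical variables a common choice is Cramér's V, and the sketch below implements the plain (uncorrected) form as an illustration; the report's own figures may come from a bias-corrected variant, so treat this as an assumption. It shows why first_name and last_name reach 1.000: in the sample rows every first name is always paired with the same last name. The function name `cramers_v` and the toy data are illustrative.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals, y_totals = Counter(xs), Counter(ys)
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: each first name always appears with the same last name.
first = ["John", "Jane", "Bob", "Alice"] * 5
last = ["Smith", "Johnson", "Williams", "Brown"] * 5
print(round(cramers_v(first, last), 3))  # 1.0

# Toy data: the two labels vary independently of each other.
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))  # 0.0
```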

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

     customer_id  first_name  last_name  email                phone            age   salary  department   hire_date   is_active
0    CUST_01000   John        Smith      NaN                  NaN              NaN   -999    Sales        2019-10-13  False
1    CUST_01001   Jane        Johnson    user1@example.com    +1-555-202-1860  66.0  32049   Marketing    2020-10-25  True
2    CUST_01002   Bob         Williams   user2@example.com    +1-555-370-6191  75.0  61616   Engineering  2022-10-03  True
3    CUST_01003   Alice       Brown      user3@example.com    +1-555-800-6734  69.0  133727  HR           2019-04-15  True
4    CUST_01004   Charlie     Jones      user4@example.com    +1-555-221-1466  29.0  142893  Finance      2016-04-17  True
5    CUST_01005   Diana       Garcia     user5@example.com    +1-555-314-5426  79.0  50932   Sales        2018-09-10  True
6    CUST_01006   Eve         Miller     user6@example.com    +1-555-558-9322  56.0  147796  Marketing    2024-10-31  True
7    CUST_01007   Frank       Davis      user7@example.com    +1-555-761-1769  19.0  59855   Engineering  2025-03-01  True
8    CUST_01008   Grace       Rodriguez  user8@example.com    +1-555-443-7949  20.0  91434   HR           2023-12-31  True
9    CUST_01009   Henry       Martinez   user9@example.com    +1-555-485-6311  66.0  102694  Finance      2016-12-23  True

     customer_id  first_name  last_name  email                phone            age   salary  department   hire_date   is_active
91   CUST_01091   Jane        Johnson    user91@example.com   +1-555-484-2306  24.0  132946  Marketing    2019-05-01  True
92   CUST_01092   Bob         Williams   user92@example.com   +1-555-732-6864  77.0  139616  Engineering  2017-02-07  True
93   CUST_01093   Alice       Brown      user93@example.com   +1-555-358-8526  33.0  135983  HR           2021-04-16  True
94   CUST_01094   Charlie     Jones      user94@example.com   +1-555-809-6575  43.0  103744  Finance      2017-04-21  True
95   CUST_01095   Diana       Garcia     user95@example.com   +1-555-510-5413  65.0  86491   Sales        2025-08-24  True
96   CUST_01096   Eve         Miller     user96@example.com   +1-555-776-1663  74.0  48589   Marketing    2022-07-29  True
97   CUST_01097   Frank       Davis      user97@example.com   +1-555-926-2495  69.0  73484   Engineering  2022-04-27  True
98   CUST_01098   Grace       Rodriguez  user98@example.com   +1-555-332-4763  77.0  112989  HR           2023-07-17  True
99   CUST_01099   Henry       Martinez   user99@example.com   +1-555-212-2853  66.0  66212   Finance      2023-04-05  True
100  CUST_01100   Emma        Wilson     user100@example.com  NaN              NaN   73525   Sales        2022-12-11  False